Prosodic and spectral features within segment-based acoustic modeling
Abstract
Besides the commonly used MFCC, PLP, and energy features, duration, low-order formants, pitch, and center-of-gravity-based features are known to carry valuable information for phoneme recognition. This work investigates their individual performance within segment-based acoustic modeling. It also reports experiments that optimize a feature space spanned exclusively by this set, using CFSS feature space optimization and speaker adaptation. All tests are carried out with SVM on the open IFA corpus of 47 hand-labeled Dutch phonemes with a total of 178k instances. Extensive speaker-dependent vs. speaker-independent test runs are discussed, as well as four speaking styles ranging from informal to formal: informal and retold storytelling, and reading aloud with fixed and variable content. Results show the potential of these rather uncommon features, such as those based on F3 or pitch.
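As a rough illustration of what a segment-level feature of this kind can look like, the sketch below computes a segment's duration and a spectral center of gravity. The abstract does not define its center-of-gravity features, so the magnitude-weighted mean frequency used here is an assumption, and the function names `spectral_center_of_gravity` and `segment_features` are hypothetical. A naive DFT keeps the example free of dependencies; it is only practical for short segments.

```python
import cmath
import math


def spectral_center_of_gravity(samples, sample_rate):
    """Magnitude-weighted mean frequency (Hz) of one segment.

    Assumed definition: sum(f_k * |X_k|) / sum(|X_k|) over the
    non-negative frequency bins. The paper may define it differently.
    """
    n = len(samples)
    if n == 0:
        return 0.0
    weighted = 0.0
    total = 0.0
    for k in range(n // 2 + 1):
        # Naive DFT coefficient for bin k (O(n) per bin).
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(samples)
        )
        magnitude = abs(coeff)
        weighted += (k * sample_rate / n) * magnitude
        total += magnitude
    # A silent segment has no spectral mass; return 0.0 by convention.
    return weighted / total if total > 0 else 0.0


def segment_features(samples, sample_rate):
    """Hypothetical per-segment feature dictionary: duration and center of gravity."""
    return {
        "duration_s": len(samples) / sample_rate,
        "cog_hz": spectral_center_of_gravity(samples, sample_rate),
    }


# Usage: a 1 kHz tone sampled at 8 kHz, 256 samples long.
tone = [math.sin(2 * math.pi * 1000 * t / 8000) for t in range(256)]
features = segment_features(tone, 8000)
```

For a pure tone that falls exactly on a DFT bin, as in the usage lines, the center of gravity coincides with the tone frequency. Per-segment values like these could then be collected into the feature vector passed to a classifier such as an SVM.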
Similar resources
The effect of bilateral subthalamic nucleus deep brain stimulation (STN-DBS) on the acoustic and prosodic features in patients with Parkinson’s disease: A study protocol for the first trial on Iranian patients
Background: The effect of subthalamic nucleus deep brain stimulation (STN-DBS) on voice features in Parkinson’s disease (PD) is controversial. No study has comprehensively evaluated the voice features of PD patients who underwent STN-DBS using acoustic, perceptual, and patient-based assessments. Furthermore, no study has investigated prosodic features before and after DBS in PD. The curren...
A Study of the Relationship between Acoustic Features of “bæle” and the Paralinguistic Information
Language users benefit from special phonetic tools to communicate linguistic information as well as emotional and paralinguistic information in daily conversation. Because they help convey semantic information to listeners, prosodic features form an essential part of linguistic behaviour, and manipulating them can potentially play an important role in transmitt...
Emotion Classification of Infants' Cries Using Duration Ratios of Acoustic Segments
We propose an approach to the classification of emotion clusters using prosodic features. In our approach, we use the duration ratios of specific acoustic segments (resonant cry segments and silence segments) in the infants’ cries as prosodic features. We use power and pitch information to detect these segment periods and use a normal distribution as a prosodic model to approximate the occurrence p...
A preliminary study on acoustic correlates of tone2+tone2 disyllabic word stress in Mandarin
This paper investigated the potential acoustic correlates of word stress within a disyllabic tonal sequence, a rising tone followed by a rising tone (Tone2 Tone2) in Mandarin, based on a large corpus with adequate information on stress patterns and prosodic boundary levels. The results showed that a) for Tone2+Tone2 words, features based on the tone nucleus were more effective than those of the whol...
Joint Modeling of Text and Acoustic-Prosodic Cues for Neural Parsing
In conversational speech, the acoustic signal provides cues that help listeners disambiguate difficult parses. For automatically parsing a spoken utterance, we introduce a model that integrates transcribed text and acoustic-prosodic features using a convolutional neural network over energy and pitch trajectories coupled with an attention-based recurrent neural network that accepts text and word...